Network graph parser

ABSTRACT

An approach for processing node data from code repository websites to generate patterns is disclosed. Node data can be parsed from a projects webpage or received from a code repository server hosting the repository website. Visualizations can be generated in a browser from the node data. The visualizations can be displayed within the browser and further be used to receive filter instructions. Refined node data can then be exported for further analysis.

PRIORITY APPLICATION

This application is a continuation of U.S. patent application Ser. No. 15/642,820, filed Jul. 6, 2017, which claims priority to U.S. Provisional Patent Application Ser. No. 62/448,081, filed Jan. 19, 2017, the disclosures of which are incorporated herein in their entireties by reference.

TECHNICAL FIELD

Embodiments of the present disclosure relate generally to pattern detection and, more particularly, but not by way of limitation, to manipulating data via a network graph parser to expose previously undetected patterns.

BACKGROUND

A code repository website allows users to publish software code projects to the website so that other users can access, view, edit, or otherwise use the published software code. Identifying how different projects (e.g., software coding projects) are related to one another is currently impractical because the project data on the code repository websites is largely unstructured.

BRIEF DESCRIPTION OF THE DRAWINGS

Various ones of the appended drawings merely illustrate example embodiments of the present disclosure and should not be considered as limiting its scope.

FIG. 1 is a block diagram illustrating a networked system in which a network graph parser can be implemented, according to some example embodiments.

FIG. 2 is a block diagram showing functional components provided within the network graph parser, according to some example embodiments.

FIG. 3 shows a flow diagram for generating node data for export, according to some example embodiments.

FIG. 4 shows a flow diagram for parsing node data from multiple selected entities, according to some example embodiments.

FIGS. 5A and 5B show example visualizations of node data, according to some example embodiments.

FIG. 6 shows a flow diagram for selecting entities, according to some example embodiments.

FIG. 7 shows example visualizations from node data of different selected entities, according to some example embodiments.

FIG. 8 shows an example flow diagram for processing entity data, according to some example embodiments.

FIG. 9 shows example visualizations, according to some example embodiments.

FIG. 10 shows an example flow diagram for receiving filter instructions, according to some example embodiments.

FIG. 11 shows example visualization and user interface elements for filtering node data, according to some example embodiments.

FIG. 12 shows a flow diagram for filtering node data, according to some example embodiments.

FIGS. 13A and 13B show example visualization and user interface elements for filtering node data, according to some example embodiments.

FIG. 14 shows an example flow diagram for analysis and export of node data, according to some example embodiments.

FIGS. 15A and 15B show example user interfaces for processing network graphs using a network graph parser, according to some example embodiments.

FIG. 16 illustrates a diagrammatic representation of a machine in the form of a computer system within which a set of instructions may be executed for causing the machine to perform any one or more of the methodologies discussed herein, according to an example embodiment.

DETAILED DESCRIPTION

The description that follows includes systems, methods, techniques, instruction sequences, and computing machine program products that embody illustrative embodiments of the disclosure. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide an understanding of various embodiments of the inventive subject matter. It will be evident, however, to those skilled in the art, that embodiments of the inventive subject matter may be practiced without these specific details. In general, well-known instruction instances, protocols, structures, and techniques are not necessarily shown in detail.

In various example embodiments, a network graph parser is implemented to parse data from websites (e.g., code repository websites) into human-understandable patterns. According to some example embodiments, the code repository websites are websites or network-based publication platforms (e.g., Internet forums) that allow users to publish data viewable by other users of the website or platform. For example, a software developer can create a project page on a code repository site and publish his/her code for the project to the project page. Other users may navigate to the project page and view, download, or modify the code for the project.

According to some example embodiments, the network graph parser is installed as a browser plugin of an Internet browser application. A data analyst may navigate to a given page on a repository website, such as a page created by or associated with a project or a contributor. The analyst may then trigger the parse operation by selecting a browser plugin button. The parse operation goes through the page and saves data on the page and on related pages. For example, the network graph parser may identify links to projects listed on the repository website. In some embodiments, the network graph parser may navigate to each of the projects.

The saved data may be used to generate a visual representation (e.g., a network graph) of the collected data. The data analyst may manipulate the visual representation to explore patterns. Further, the data analyst may narrow the data down to specific subsets by issuing filter instructions. For example, the data analyst may filter out any nodes that do not have at least two connections to other nodes. Contributors may have connections to one another by working together on the same coding project, as an example. The various filter instructions expose previously invisible patterns in the network graph. The narrowed-down data containing the pattern can then be exported over a network to a data analysis server for further analysis, according to some example embodiments.
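As a non-limiting illustration, the example filter instruction above (keeping only nodes with at least two connections) might be sketched as follows. The function name filter_by_min_degree, the edge-list representation, and the sample contributor names are assumptions made for illustration and are not part of the disclosure. The sketch applies the filter in a single pass and does not recompute connection counts after removal.

```python
from collections import defaultdict


def filter_by_min_degree(edges, min_degree=2):
    """Return the edges whose endpoints each have at least `min_degree` neighbors.

    `edges` is a list of (node_a, node_b) pairs treated as undirected links.
    Connection counts are taken from the unfiltered graph (single pass).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Nodes that satisfy the filter instruction.
    keep = {node for node, adjacent in neighbors.items() if len(adjacent) >= min_degree}
    return [(a, b) for a, b in edges if a in keep and b in keep]


# Hypothetical contributor links; "dave" has only one connection and is culled.
edges = [("alice", "bob"), ("bob", "carol"), ("carol", "alice"), ("carol", "dave")]
refined = filter_by_min_degree(edges)
```

In this sketch, the refined edge list retains the three links among "alice", "bob", and "carol", and the link to "dave" is removed.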

FIG. 1 is a block diagram depicting a networked system 100 comprising an electronic device 110, and one or more components external to the electronic device 110. These external components include a database system 10, network 120, and a plurality of repository servers 130-1 to 130-n that host repository websites. According to some example embodiments, the electronic device 110 is a client device, such as a personal computer, a tablet computer, a personal digital assistant (PDA), a mobile phone, a smart-phone, or any other web-enabled computing device with a processor and a memory. The electronic device 110 has installed thereon a web browser application (e.g., web browser 1632 in FIG. 16), on which is installed a network graph parser. According to some example embodiments, the network graph parser is integrated into the web browser application as a plugin or browser extension. Each of the plurality of repository servers 130-1 to 130-n comprises hardware and software. Each of the plurality of repository servers 130-1 to 130-n is able to communicate with the electronic device 110 via the network 120.

In some embodiments, some of the plurality of repository servers 130-1 to 130-n can be a part of a cloud, which can include, for example, one or more networked servers. Such networked servers may be termed a data center or a server farm. Such data centers currently are maintained by various communication network service providers. Network 120 can be, for example, the Internet, an intranet, a local area network, a wide area network, a campus area network, a metropolitan area network, an extranet, a private extranet, or a combination of any of these or other appropriate networks.

For the exemplary embodiment of FIG. 1, it is understood that the electronic device 110 is separate from the external database system 10 but connected thereto by a link. Alternatively, the database system 10 may be disposed in an air-gapped, high-side environment, where the database system 10 is physically isolated from the network 120 and the electronic device 110, such that a higher level of classified information can be maintained in the database system 10.

The electronic device 110 may be implemented by one or more specially configured computing devices. The electronic device 110 may be hard-wired to perform the operations, techniques, etc. described herein. The electronic device 110 can include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the operations, techniques, etc. described herein. The electronic device 110 can include one or more general purpose hardware processors (including processor circuitry) programmed to perform such features of the present disclosure pursuant to program instructions in firmware, memory, other storage, or a combination. The electronic device 110 can also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the methods and other features.

The electronic device 110 can be generally controlled and coordinated by operating system software, such as iOS, Android, Blackberry, Chrome OS, Windows XP, Windows Vista, Windows 7, Windows 8, Windows Server, Windows CE, Unix, Linux, SunOS, Solaris, VxWorks, or a proprietary operating system. The operating system controls and schedules computer processes for execution, performs memory management, provides file system, networking, and I/O services, and provides user interface functionality, such as a graphical user interface (“GUI”), among other things.

FIG. 2 shows internal functional components of the network graph parser 234, according to some example embodiments. In the illustrated embodiment, the network graph parser is implemented as a plug-in or browser extension for a web browser. As illustrated, the network graph parser 234 comprises an interface engine 210, a parse engine 220, a node data engine 230, a visualization engine 240, and an export engine 250. The interface engine 210 is configured to interface with the browser 1632 as a plugin. Further, the interface engine 210 is configured to interface with entities outside the electronic device 110, such as the repository server 130. The parse engine 220 is configured to parse node data from a code project webpage. Node data is data of an object associated with a given project. In some example embodiments, the node data is user data (e.g., data of software developers) associated with a given project, where each user may be involved in several different software projects. In some example embodiments, the node data includes code portions (e.g., classes, functions) shared between different projects. For example, two different projects may share an Optical Character Recognition (OCR) class, and the OCR class code can be used as a node associated with each software project's network graph, as discussed in further detail below. As a further example, according to some example embodiments, metadata describing an object (e.g., code portion) may be used as a node associated with each software project's network graph.

In some example embodiments, where the repository website is configured to provide node data, the parse engine 220 is configured to send node data requests for the users of the repository website. The repository website can receive the requests and issue responses including the requested node data. The node data engine 230 is configured to process the node data received via the code projects webpage (e.g., via spidering) or received from the repository website. The node data engine 230 can receive filter instructions from a user and cull (e.g., refine) the node data by removing data of users that do not meet the requirements of the filter instruction, as explained in further detail below. The visualization engine 240 is configured to use the initial node data or the refined node data and generate different types of visualizations for display on the display screen of the electronic device 110. The visualizations may include a network graph, a histogram, graphs such as bar charts or data plots, and other visualizations. The export engine 250 is configured to export the refined dataset to an analysis server for further analysis.

FIG. 3 is a flowchart representing an exemplary method 300 performed by an electronic device for collecting and analyzing data from repository website systems, according to some example embodiments. While the flowchart discloses the following operations in a particular order, it will be appreciated that at least some of the operations can be moved, modified, or deleted where appropriate, consistent with the teachings of the present disclosure. In the depicted embodiment of FIG. 3, a user can utilize an electronic device (e.g., electronic device 110) that comprises a web browser 1632, for example, Google™ Chrome™, Mozilla™ Firefox™, Microsoft™ Internet Explorer™, etc. The web browser 1632 is usable to access web content (e.g. provided by the repository servers 130-1 to 130-n) via a network (e.g., network 120), such as the Internet or an intranet.

At operation 310, a network graph parser 234 is installed as a plugin in the web browser 1632 of the electronic device. The network graph parser 234 may be termed a browser extension, according to some example embodiments. The network graph parser 234 extends the functionality of the web browser 1632, as is described in detail below. The network graph parser 234 may be authored using web technologies such as HTML, JavaScript, or CSS (Cascading Style Sheets).

Referring again to FIG. 3, at operation 320, the interface engine 210 accesses communication and node data related content from a repository server 130 (e.g., one of the repository servers 130-1 to 130-n) using the web browser 1632. In the following description, reference is made generally to accessing content from a repository server 130, and it will be appreciated that, unless the context indicates otherwise, such references are to accessing content from a particular repository server 130, for instance the first repository server 130-1.

According to some example embodiments, the repository server 130 is accessed through the browser 1632, which causes a request (e.g., an HTTP request) to be sent to the repository server 130 (in particular, to a web server included as part thereof).

Once the user has accessed the repository server 130 using the browser 1632, they may control the browser 1632 to interact with the repository server 130 using user interface controls provided in the browser 1632 by the network graph parser 234 or using controls provided by the browser itself. In some example embodiments, the information received from the repository webservice comprises a projects webpage showing different coding or software projects associated with a user of the repository webservice.

At operation 330, the parse engine 220 parses the data from the projects webpage and stores the data in local memory of the electronic device 110. According to some example embodiments, node data is a user profile and relates to an entity that is included as part of the repository network service provided by the repository server 130. Further, according to some example embodiments, an entity typically relates to an individual programmer, but may relate to an organization, for instance a business or other group. In some example embodiments, a software developer profile includes at least a unique identifier (the identifier uniquely identifies the entity on the repository service), a name for the entity (typically a string of text, perhaps alphanumeric characters) and a plurality of links between the entity and other entities that form part of the repository webservice.

The links may be bidirectional in nature. For example, two software developers may collaborate on the same code project. Because the two developers work on the same coding project, they may be bidirectionally linked under the assumption that each knows of the other as a fellow coder (e.g., team member, colleague) on the project. The links may alternatively be unidirectional, e.g., the first software developer receives updates published by the second software developer but the second software developer does not receive updates published by the first software developer. In some embodiments, the data stored on the repository website indicates the type of communication activity between the users. For example, the node data may include an indication that a first user commented on a pending code update on a project page on the repository website. The links may indicate another entity by including an identifier that is unique to the other entity. Typically, repository webservices provide an identifier that is an alphanumeric string. The string may be known to the entity and other users (e.g. it may be their username) or it may be a system-generated identifier which does not need to be known to the user (e.g. a string such as “exampleidentifier$43*”). The profile may also include a uniform resource locator (URL) that is unique to the entity.
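As a non-limiting illustration, a profile holding a unique identifier, a name, a URL, and a set of links, together with bidirectional and unidirectional linking, might be represented as sketched below. The Profile class, its field names, the helper functions, and the example URLs are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical in-memory profile record for one entity."""
    identifier: str                           # unique on the repository service
    name: str                                 # display name of the entity
    url: str = ""                             # URL unique to the entity
    links: set = field(default_factory=set)   # identifiers this entity links to


def add_link(profiles, source_id, target_id, bidirectional=True):
    """Record a link from source to target, and the reverse link if bidirectional."""
    profiles[source_id].links.add(target_id)
    if bidirectional:
        profiles[target_id].links.add(source_id)


def is_bidirectional(profiles, a, b):
    """True when each entity links to the other."""
    return b in profiles[a].links and a in profiles[b].links


profiles = {
    "dev1": Profile("dev1", "First Developer", "https://repository.example/dev1"),
    "dev2": Profile("dev2", "Second Developer", "https://repository.example/dev2"),
    "dev3": Profile("dev3", "Third Developer", "https://repository.example/dev3"),
}
add_link(profiles, "dev1", "dev2")                        # collaborators on one project
add_link(profiles, "dev1", "dev3", bidirectional=False)   # dev1 receives dev3's updates only
```

Storing only identifiers in the link set keeps each record small and lets a link point to an entity whose full profile has not yet been retrieved.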

A user profile of the repository website may also have other information associated with pre-defined fields, for instance ‘high school attended’, ‘place of residence’, ‘place of work’, ‘undergraduate study subject’, etc. The profile may also have other content such as photographs, videos, comments or profile text, etc. Profile content may be associated with particular dates (and as such may appear in a timeline on a user's profile page) or may not be dependent on a date (and so may not generally appear on a timeline). In some embodiments, profile content may be associated with geotagged data.

In some example embodiments, user profiles are imported in response to user input. For example, a first profile is imported by the network graph parser 234 in response to the user selecting a first entity in the repository service. This may occur for instance by the user selecting a hyperlink in a code projects webpage provided by the repository server 130. The code projects webpage may be provided by the repository server 130 in response to the user entering text, e.g. the whole or part of a name of an entity, into a search field of a webpage provided by the repository server 130. The code projects webpage displays coding projects of the first entity, where each of the coding projects has its own projects webpage, which can be spidered as described above. According to some example embodiments, upon selection of the first entity, the network graph parser 234 sends a request to the repository server 130 identifying the first entity. In response, the repository server 130 provides the code projects webpage of the first entity, which is then parsed by the network graph parser 234. In some example embodiments, the network graph parser 234 extracts node data from the code projects webpage by accessing the source code (e.g., markup language) of the code projects webpage and then extracting the node data listed in the source code. The received node data is stored in volatile memory (e.g. RAM) allocated to the browser 1632, but is not stored in permanent memory, e.g. ROM.

After the node data of the first entity is imported by the network graph parser 234, or at least after importation has begun, the user selects a second entity. This may occur for instance by the user selecting a hyperlink relating to the second entity in a second code projects webpage provided by the repository server 130. Upon selection of the second entity, the network graph parser 234 sends a request to the repository server 130 identifying the second entity. In response, the repository server 130 provides a second code projects webpage that lists all the coding projects for the second entity on the repository server. The parse engine 220 then parses the source code of the second code projects webpage to extract additional node data of the users associated with the second entity (e.g., users that have worked on the same coding project as the second entity). The received profile is stored in volatile memory (e.g. RAM) allocated to the browser 1632, but is not stored in permanent memory, e.g. ROM.

According to some example embodiments, the network graph parser 234 is configured to automatically import node data for entities to which the first and second entities are linked, e.g., for which links from the first and second entities exist. The parse engine 220 is configured to import such node data by sending requests to the repository server 130 identifying the further entities and navigating to the code projects webpages of the entities.

At operation 235, the node data engine 230 transforms the contributor data parsed from the projects webpage from a first format into a second format. For example, the underlying source code of the projects webpage may be a markup language, such as HTML. The node data parsed from the projects webpage may also be in the markup language format. The node data engine 230 is configured to transform the node data from the markup language format to an attribute-value format, such as JSON (JavaScript Object Notation). The node data in the second format can be used for filtering and generation of the visualizations.
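As a non-limiting illustration, a transformation of node data from a markup language format into an attribute-value format might be sketched as follows using only the Python standard library. The markup structure (anchor elements carrying a "contributor" class), the class name ContributorParser, and the output keys "username" and "profile_url" are assumptions made for illustration; an actual projects webpage may be structured differently.

```python
import json
from html.parser import HTMLParser


class ContributorParser(HTMLParser):
    """Collect <a class="contributor" href="...">name</a> elements (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._current = None  # record being built while inside a contributor anchor

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "a" and "contributor" in classes:
            self._current = {"username": "", "profile_url": attributes.get("href") or ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["username"] += data.strip()

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.records.append(self._current)
            self._current = None


def markup_to_json(markup):
    """Transform contributor markup (first format) into a JSON string (second format)."""
    parser = ContributorParser()
    parser.feed(markup)
    parser.close()
    return json.dumps(parser.records)


page = (
    '<ul><li><a class="contributor" href="/users/alice">alice</a></li>'
    '<li><a class="contributor" href="/users/bob">bob</a></li>'
    '<li><a href="/about">About</a></li></ul>'
)
node_json = markup_to_json(page)
```

In this sketch, anchors without the assumed "contributor" class, such as the "About" link, are ignored, so only contributor records appear in the resulting JSON.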

At operation 340, the visualization engine 240 creates a visual representation from the parsed node data (e.g., node data in the attribute-value format). In some example embodiments, the visual representation is generated as a network graph in an additional tab of the browser 1632. The network graph includes a collection of nodes connected by edges. Each node corresponds to a user from one of the projects listed on a code projects webpage, and connections between individual nodes may be visually represented as lines, for example straight lines. In some example embodiments, two nodes are connected on the repository server if each of the nodes is associated with the same coding project. The graph may lend itself to be further processed, analyzed and manipulated by an analyst or other user. The details regarding operation 340 are explained in more detail later.
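As a non-limiting illustration, deriving graph edges under the rule that two nodes are connected when both are associated with the same coding project might be sketched as follows. The function name build_edges, the dictionary layout, and the project and contributor names are hypothetical.

```python
from itertools import combinations


def build_edges(project_contributors):
    """Return sorted, de-duplicated undirected edges between co-contributors.

    `project_contributors` maps a project name to the contributors on that project.
    """
    edges = set()
    for contributors in project_contributors.values():
        # Every pair of distinct contributors on the same project shares an edge.
        for a, b in combinations(sorted(set(contributors)), 2):
            edges.add((a, b))
    return sorted(edges)


projects = {
    "ocr-tool": ["alice", "bob", "carol"],
    "graph-lib": ["bob", "dave"],
}
edges = build_edges(projects)
```

Sorting each pair before adding it to the set ensures that a pair of contributors who share several projects yields a single edge.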

At operation 350, the export engine 250 exports the graph formed from the operation 340 to the database system 10. The database system 10 is connected to the electronic device 110 (as shown in FIG. 1), according to some example embodiments. Further, according to some example embodiments, the database system 10 is implemented as a backend system disposed in an air-gapped, high-side environment, separated from the network 120 and the electronic device 110. The database system 10 may be dedicated to receive data for further analysis. Therefore, network graph parser 234 of operation 340 can be used to collect and pre-process the data such that it is compatible with the database system 10.

FIG. 4 is a flowchart representing a method 400 performed by the network graph parser 234 for importing node data of the first and second entities from the repository server 130 and for creating a graph, according to some example embodiments. The method 400 is an example of sub-operations performed to complete operations 330 and 340 of FIG. 3 discussed above. FIGS. 5A and 5B show an example of the graph created by the exemplary method 400.

At operation 410, the interface engine 210 receives a selection of a first entity through a user input, for instance through a bookmark, favorite, or through selection of an option provided in a list of search results. At operation 420, the interface engine 210 requests the profile of the first entity. This involves the network graph parser 234 accessing the repository server 130 via the network 120 and in particular accessing the first entity (e.g., projects webpage of the first entity) in the repository server 130. In particular, the network graph parser 234 may send an HTTP request to the repository server 130, the request including the unique identifier of the first entity.
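As a non-limiting illustration, constructing an HTTP request that includes the unique identifier of an entity might be sketched as follows. The host name repository.example, the "users/" path layout, and the function name build_profile_request are assumptions made for illustration; the disclosure does not specify a URL scheme. The sketch only builds the request object and does not send it.

```python
from urllib.parse import quote, urljoin
from urllib.request import Request


def build_profile_request(base_url, entity_id):
    """Build (but do not send) a GET request for an entity's projects webpage.

    The identifier is percent-encoded so that characters such as '$' or '*'
    are safe to place in the URL path.
    """
    url = urljoin(base_url, "users/" + quote(entity_id, safe=""))
    return Request(url, method="GET", headers={"Accept": "text/html"})


# Hypothetical host and identifier.
request = build_profile_request("https://repository.example/", "exampleidentifier$43*")
```

The resulting request could then be passed to urllib.request.urlopen, or an equivalent browser facility could be used, to retrieve the profile as an HTTP response.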

At operation 430, the network graph parser 234 receives the profile or projects webpage of the first entity. The profile is for example received as an HTTP response. According to some example embodiments, the profile includes a name for the first entity and details of connections of the first entity. The connections define links to other entities, and include unique identifiers for the other entities. In some example embodiments, one or more webpages of the first entity may be exposed through automatic scrolling of the one or more webpages. For example, a top portion of a first entity's webpage may be initially retrieved, and further portions below the top portion may be auto-populated by script as those portions are scrolled to. In some example embodiments, the auto-populated scrolled-to portions are received at operation 430.

At operation 440, the visualization engine 240 displays a graph relating to the first entity. For example, the network graph parser 234 may display a group or ‘cloud’ of nodes, each node relating to an entity. The node relating to the first entity is displayed with different visible characteristics to nodes for other entities. For instance, it may be a different color or size. All the nodes for entities linked to the first entity are shown as being connected by the inclusion on the graph of a line, e.g. a straight line, connecting the node to the node for the first entity. In some embodiments, connections between nodes other than connections between the first entity and other nodes may not be displayed in the graph.

Further, in some example embodiments, a further entity (e.g., a second entity of operation 460) need not be specified for links between nodes to be created. For example, an entity associated with a given code repository page may be identified (e.g., at operation 410). The code repository page may list other coding projects with which the entity is involved (e.g., develops code). Each coding project may list other further entities associated with the given project. Using the identified entity, the additional projects and additional entities can all automatically be included in a single network graph, according to some example embodiments.

In the following discussion, the terms ‘connected’ and ‘linked’ in relation to entities included in the electronic repository website can be used interchangeably. FIG. 5A shows an example of parsing profiles of users. In the left panel of FIG. 5A, a node 501 is displayed as an empty circle. The node 501 corresponds to the first entity. Each of ten nodes displayed as a group around the node 501 represents a different entity to which the first entity is linked or connected, as identified from the profile of the first entity. Each such node in the group (other than the first node 501) is connected to the node 501 with a respective straight line, which represents a link between the corresponding two entities. In the following, a group of nodes connected to the first node 501 once by a single link in the created graph may be represented as being enclosed within a dotted circle, as shown in the right panel of FIG. 5A.

At operation 450, the network graph parser 234 begins requesting profiles of entities linked to by the first entity. In some example embodiments, the profiles are parsed from a code projects webpage of the first entity. For example, users associated with the first entity may be displayed in a projects webpage. The underlying markup language of the code projects webpage can be parsed to extract the username, user profile URL, and other information for each of the users associated with the first entity.

At operation 455, profiles of the entities are stored as they are received. In one embodiment, the profiles are stored in volatile memory that is allocated to the browser 1632. Profiles may continue to be requested and saved as a background task whilst the network graph parser 234 performs other tasks.

At operation 460, the interface engine 210 receives selection of a second entity. This may occur as described above in relation to receiving selection of the first entity.

At operation 470, the network graph parser 234 receives the profile of the second entity, after requesting the profile of the second entity. The profile is for example received as part of an HTTP response. The profile includes at least a name for the second entity and details of connections of the second entity. The connections define links to other entities, and include unique identifiers for the other entities.

At operation 490, the visualization engine 240 displays a graph relating to the first and second entities. For example, the network graph parser 234 may display three groups (or clouds) of nodes 510, 520, 530, each node relating to an entity. The nodes 501 and 502 relating to the first and second entities are displayed with different visible characteristics to nodes for other entities. For instance, they may be a different color or size. Each node of the first group 530 of nodes corresponds to an entity linked to in the profiles of both the first and second entities. Each node of the second group 510 of nodes corresponds to an entity linked to by the profile of the first entity but not by the profile of the second entity. Each node of the third group 520 of nodes corresponds to an entity linked to by the profile of the second entity but not by the profile of the first entity. All the nodes for entities connected to the first entity are shown as being connected to the node 501 by the inclusion on the graph of a line, e.g. a straight line, connecting the node to the node for the first entity. All the nodes for entities connected to the second entity are shown as being connected to the node 502 by the inclusion on the graph of a line, e.g. a straight line, connecting the node to the node for the second entity. Connections between nodes other than connections between one of the node 501 and the node 502 and other nodes are not displayed in the graph.

At operation 490, the visualization engine 240 creates a new graph after removing the graph as shown in FIG. 5A. Alternatively, the network graph parser 234 may augment and rearrange the graph created at operation 440. An example of the created graph at operation 490 is shown in FIG. 5B. Nodes 501 and 502 represent the first entity and the second entity, respectively. The first group 530, comprising seven nodes (represented as filled circles), corresponds to the entities linked to by both the first entity and the second entity. The second group 510 comprises only three nodes because seven nodes previously in the sole group now belong to the first group 530. The profile of the second entity includes links to fourteen entities. Seven entities linked to in the profile for the second entity belong to the first group 530, and the other seven entities belong to the third group 520. The nodes of the group from FIG. 5A are now split and rearranged into two groups, namely the second group 510 and the first group 530. Therefore, in displaying nodes corresponding to the entities linked to by the first entity 501 and second entity 502, accessed by the network graph parser 234, they are grouped into three groups: the second group 510 linked only to the first entity 501, the third group 520 linked only to the second entity 502, and the first group 530 linked both to the first entity 501 and second entity 502.
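As a non-limiting illustration, the three-way grouping described above can be expressed with set operations. The function name group_connections, the dictionary keys, and the placeholder identifiers "e1" through "e17" are hypothetical; the counts mirror the example of FIGS. 5A and 5B, in which the first entity links to ten entities and the second entity links to fourteen, seven of which are shared.

```python
def group_connections(first_links, second_links):
    """Split linked entities into the three groups described for FIG. 5B."""
    first, second = set(first_links), set(second_links)
    return {
        "first_group_530": first & second,    # linked to by both entities
        "second_group_510": first - second,   # linked to only by the first entity
        "third_group_520": second - first,    # linked to only by the second entity
    }


# Placeholder identifiers: e1..e10 for the first entity, e4..e17 for the second.
first_links = [f"e{i}" for i in range(1, 11)]
second_links = [f"e{i}" for i in range(4, 18)]
groups = group_connections(first_links, second_links)
group_sizes = {name: len(members) for name, members in groups.items()}
```

With these inputs, the first group 530 contains seven entities, the second group 510 contains three, and the third group 520 contains seven, consistent with the counts given above.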

At operation 495, the interface engine 210 begins requesting profiles of entities linked to by the second entity. In some example embodiments, the operations of 450 and 495 (e.g., requests for profiles of related entities) are initiated by a manual user request. For example, after the user (at operation 420) requests the profile of the first entity, the user (at operation 450) further requests (e.g., using a GUI button) the profiles of entities related to the first entity. Further, according to some example embodiments, the operations of 450 and 495 are performed automatically by the network graph parser. For example, after the user (at operation 420) requests profile information of the first entity, the network graph parser 234 not only retrieves and sends the profile information of the specified first entity but also automatically retrieves and sends profile information of entities related to the first entity (e.g., without the user manually initiating the request for profile information of the related entities).
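As a non-limiting illustration, automatic retrieval of the profiles of related entities might be sketched as a bounded breadth-first traversal. The function name collect_profiles, the fetch_profile callback, the max_depth parameter, and the in-memory SITE dictionary standing in for a repository server are all assumptions made for illustration; no network access is performed in this sketch.

```python
def collect_profiles(seed_id, fetch_profile, max_depth=1):
    """Retrieve the seed profile and the profiles of entities it links to.

    `fetch_profile(entity_id)` is assumed to return a dict with a "links" list.
    `max_depth=1` retrieves the seed plus its directly linked entities.
    """
    profiles = {}
    frontier = [seed_id]
    for _ in range(max_depth + 1):
        next_frontier = []
        for entity_id in frontier:
            if entity_id in profiles:
                continue  # already retrieved; avoid duplicate requests
            profile = fetch_profile(entity_id)
            profiles[entity_id] = profile  # stored as it is received
            next_frontier.extend(profile["links"])
        frontier = next_frontier
    return profiles


# Hypothetical stand-in for responses from a repository server.
SITE = {
    "first": {"name": "First", "links": ["a", "b"]},
    "a": {"name": "A", "links": ["first", "c"]},
    "b": {"name": "B", "links": ["first"]},
    "c": {"name": "C", "links": ["a"]},
}
collected = collect_profiles("first", SITE.__getitem__)
```

Bounding the traversal depth keeps the retrieval limited to entities directly related to the selected entity; in this sketch the entity "c", which is two links away, is not requested.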

At operation 497, profiles of the entities are stored as they are received. In one embodiment, the profiles are stored in volatile memory, e.g., the RAM 1606, that is allocated to the browser 1632. Profiles may continue to be requested and saved as a background task whilst the network graph parser 234 performs other tasks. Further, according to some example embodiments, the display operations of method 400 (e.g., operations 440 and 490) are bypassed until some or all of the information collection operations (e.g., operations 410, 420, 430, 450, 455, 460, 470, 480, 495, and 497) are completed.
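By way of illustration only, the store-as-received behavior of operation 497 may be sketched in Python as follows. The function name collect_profiles, the fetch_profile callable, and the dictionary used as the in-memory store are hypothetical and are not taken from the disclosure; an actual embodiment would run inside the browser and may retrieve profiles as a background task.

```python
from typing import Callable, Dict, Iterable, List

Profile = Dict[str, object]


def collect_profiles(
    linked_ids: Iterable[str],
    fetch_profile: Callable[[str], Profile],  # hypothetical retrieval function
    store: Dict[str, Profile],                # stands in for volatile memory
) -> List[str]:
    """Fetch each linked profile at most once and keep it in memory."""
    fetched = []
    for entity_id in linked_ids:
        if entity_id in store:
            # Profile already held in memory, so no further request is made.
            continue
        store[entity_id] = fetch_profile(entity_id)
        fetched.append(entity_id)
    return fetched
```

Because the store is keyed by entity identifier, a profile linked to by several selected entities is requested only once in this sketch.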

FIG. 6 is a flowchart showing a method 600 performed by the network graph parser 234 for importing further seed entities from the electronic repository website system hosted by the repository server 130 and for creating a graph, according to some example embodiments. This may correspond to at least part of operations 330 and 340 of FIG. 3, as discussed above.

FIG. 7 shows an example of the graph created by the exemplary method 600. Prior to operation 610, the plugin is processing two or more entities, for instance as shown in FIG. 5B and as present at the end of the flowchart of FIG. 4. At operation 610, the interface engine 210 receives a selection of a further entity through a user input, for instance through a bookmark, a favorite, or selection of an option provided in a list of search results.

At operation 620, the interface engine 210 requests or parses the profile of the further entity. This is similar to operation 420. This involves the interface engine 210 accessing the repository server 130 via the network 120 and accessing the further entity in one of the electronic repository webservice systems in the repository server 130. In particular, the interface engine 210 may send an HTTP request to the repository server 130, the request including the unique identifier of the further entity. Alternatively, the network graph parser 234 can parse a code projects webpage to extract profile information of users connected to the further entity.

At operation 630, the interface engine 210 receives the profile of the further entity. This is similar to operation 430. The profile is, for example, received as part of an HTTP response. The profile includes at least a name for the further entity and details of connections of the further entity. The connections define links to other entities, and include unique identifiers for the other entities.
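As a non-limiting sketch, a profile received in an HTTP response body may be reduced to the name and connection identifiers described above. The sketch assumes a JSON body, and the field names "id", "name", and "connections" are hypothetical; the disclosure does not specify the response format of any particular repository server.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Profile:
    entity_id: str
    name: str
    connections: List[str] = field(default_factory=list)


def parse_profile(body: str) -> Profile:
    """Extract the name and linked entity identifiers from a JSON body."""
    data = json.loads(body)
    return Profile(
        entity_id=data["id"],      # assumed key for the unique identifier
        name=data["name"],         # assumed key for the entity name
        connections=[c["id"] for c in data.get("connections", [])],
    )
```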

At operation 640, the visualization engine 240 displays a graph relating to all the selected entities. Here, the visualization engine 240 may cause display of multiple groups (or clouds) of nodes, each node relating to an entity. Each group relates to a collection of nodes that have the same connections to the selected entities. Where there are three selected entities, there are seven groups. Each node of the first group of nodes corresponds to an entity linked to in the profiles of both the first and second entities, but not the third entity. Each node of the second group of nodes corresponds to an entity linked to by the profile of the first entity but not by the profile of the second or third entities. Each node of the third group of nodes corresponds to an entity linked to by the profile of the second entity but not by the profile of the first or third entities. Each node of the fourth group of nodes corresponds to an entity linked to in the profiles of both the first and third entities, but not the second entity. Each node of the fifth group of nodes corresponds to an entity linked to by the profile of the third entity but not by the profile of the first or second entities. Each node of the sixth group of nodes corresponds to an entity linked to by the profiles of the second and third entities but not by the profile of the first entity. Each node of the seventh group corresponds to an entity linked to by each of the first, second and third entities. One or more of the groups may not exist, if there are no nodes that meet the criteria for that group (these groups might be said to have zero nodes).
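The grouping described above may be sketched, purely by way of example, by keying each non-selected node on the set of selected entities that link to it. With n selected entities this yields at most 2^n - 1 non-empty groups (seven when n is three). The function name group_nodes and the shape of the links mapping are assumptions made for illustration, as is the choice to leave the selected entities themselves out of the groups.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Mapping, Set


def group_nodes(
    links: Mapping[str, Iterable[str]],
) -> Dict[FrozenSet[str], Set[str]]:
    """Map each combination of selected entities to the nodes it shares.

    `links` maps a selected entity identifier to the identifiers of the
    entities its profile links to.
    """
    linked_by: Dict[str, Set[str]] = defaultdict(set)
    for selected, targets in links.items():
        for target in targets:
            if target in links:
                # Selected entities are assumed to be drawn as their own nodes.
                continue
            linked_by[target].add(selected)

    groups: Dict[FrozenSet[str], Set[str]] = defaultdict(set)
    for target, selectors in linked_by.items():
        groups[frozenset(selectors)].add(target)
    return dict(groups)
```

Groups that would contain zero nodes simply do not appear as keys in the returned mapping.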

The nodes relating to the selected entities are displayed with different visible characteristics from nodes for other entities. For instance, they may be a different color or size. All the nodes for entities connected to one of the selected entities are shown as being connected by the inclusion on the graph of a line, e.g. a straight line, connecting the node to the node for the selected entity. Where a non-selected node has links to multiple selected entities, there is a line for each such connection. In some embodiments, connections between two nodes that relate to non-selected entities may be hidden or not displayed in the graph. In some example embodiments, the visualization engine 240 may simplify or de-clutter the graph by hiding links between nodes and/or nodes based upon whether a given node or one of its neighbors is selected. For example, if the user selects a given node, the visualization engine 240 may only display nodes that are directly linked to the given node.
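A minimal sketch of these display rules, assuming that connections are held as pairs of entity identifiers, is given below. The function names visible_edges and direct_neighbors are hypothetical and are used only to illustrate the filtering; they are not part of the disclosed engines.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def visible_edges(edges: Iterable[Edge], selected: Set[str]) -> List[Edge]:
    """Keep only lines that touch at least one selected entity."""
    return [(a, b) for a, b in edges if a in selected or b in selected]


def direct_neighbors(edges: Iterable[Edge], node: str) -> Set[str]:
    """Return the nodes directly linked to the given node."""
    neighbors = set()
    for a, b in edges:
        if a == node:
            neighbors.add(b)
        elif b == node:
            neighbors.add(a)
    return neighbors
```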

At operation 650, the interface engine 210 begins requesting profiles of entities linked to by the further entity. In some example embodiments, the user manually requests the profiles of entities linked to by the further entity. At operation 660, profiles of the entities are stored as they are received. In one embodiment, the profiles may be stored in volatile memory that is allocated to the browser 1632. Profiles may continue to be requested and saved as a background task whilst the network graph parser 234 performs other tasks. At operation 670, the network graph parser 234 may check whether another entity has been selected by the user. If so, the method returns to operation 620, where the profile for the further selected entity is requested. Further, in some example embodiments, the selections of additional entities are processed in batches. For example, instead of requesting information of a single further entity and then receiving the information of the single further entity (e.g., method 600), the user can select a plurality of entities, then request their information as a batch process (e.g., as part of a single request).

Further, according to some example embodiments, the display operation of method 600 may be bypassed or delayed until other operations are complete. For example, operation 650 (an information collection related operation) may be performed before operation 640 (a display related operation). As a further example, the information collected at operation 650 may be stored to memory while operation 640 is bypassed, such that a display is never generated.

FIG. 7 shows a screenshot of an example of a graph generated by the visualization engine 240, according to some example embodiments. Here, six nodes 701, 702, 703, 704, 705 and 706 correspond to six entities that have been selected by a user. The nodes 701, 702, 703, 704, 705 and 706 corresponding to user-selected entities are displayed as empty circles. Entities linked to by the selected entities are represented as nodes, and are displayed as filled circles. A line connects each node pair representing linked entities if at least one of the linked entities is a user-selected entity.

It can be seen from FIG. 7 that nodes are grouped together depending on which combination of the six user-selected entities they are linked to. For example, the group of nodes 711 corresponds to non-user-selected entities linked to two of the user-selected entities, 701 and 705. Node 712 corresponds to the only non-user-selected entity linked to by the user-selected entities 702, 703 and 706, hence the node 712 forms a group on its own. The group of nodes 713 includes nodes relating to entities linked to user-selected entities 703 and 706. Each group is displayed separated from other groups, e.g. with a gap between the groups which is visibly significantly larger than the gaps between adjacent nodes forming part of a single group. The groups are displayed separately from one another to aid visual recognition of groups representing different states of connection. In some example embodiments, each node is generated from node data from the same code repository website. In some example embodiments, some of the nodes are generated from node data from a first code repository website and some of the nodes are generated from node data from a second code repository website different from the first. In this way, an analyst user can determine relationships between nodes (e.g., software project data) across different code repository websites.

FIG. 8 is a flowchart representing an exemplary method 800 performed by the network graph parser 234 to generate a list in the form of a histogram from the imported profiles. The method 800 is performed when a graph relating to at least one selected entity is provided for display by the network graph parser 234 and when the profiles for all of the selected entities and the entities linked to the selected entities have been received from the repository server 130. The histogram may be provided in response to a user input selecting a histogram option, for instance through interaction with a user interface element in a sidebar, dock, pull-down menu etc.

FIG. 9 shows an example of a graph 900 generated by the method 600. It also shows an example of a histogram 990 created by the exemplary method 800. It further shows a profile viewer 995 generated by selecting a node displayed in the graph 900. The graph 900, the histogram 990, and the profile viewer 995 are displayed at the same time on different parts of the display 212, for instance in the layout shown in the Figure. The graph 900 has been generated from the imported profiles of three user-selected entities, corresponding to displayed nodes 901, 902 and 903. As explained with reference to FIG. 5A and the accompanying paragraphs above, the circles 910, 920 and 930 represent first, second, and third groups of nodes corresponding to entities directly linked only to the user-selected entity nodes 901, 902, 903, respectively. There are three more groups of nodes 904, 906, 908, which correspond to entities linked to only two of the selected entities 901, 902 and 903. There is one group 905 of nodes linked to all three of the selected entities 901, 902 and 903.

At operation 810, the network graph parser 234 selects one of the fields of a profile relating to one of the selected entities 901, 902, 903. In this example, the profile contains fields of information common to all or many of the profiles such as place of birth, birth year, high school, and place of work.

At operation 820, the node data engine 230 then searches in all or selected imported profiles for profiles which have the same information in the same field. In particular, the node data engine 230 identifies which fields of the profile of the selected entity are populated. For a populated field, the plugin extracts the information (text, numbers or text and numbers) from the profile and searches the corresponding field of all the other profiles for the same information. Since the profiles for the entities are stored in the volatile memory allocated to the browser 1632, this searching can be relatively fast.

At operation 830, the node data engine 230 generates a record indicating any other entity which has the same information in the same field of the profile. The record is made in the working (volatile) memory 206 allocated to the web browser 1632.

At operation 840, the node data engine 230 determines whether there are other fields in the profile for the selected entity that include information and that have yet to be processed. If there are such other fields, then the method proceeds to operation 850, where another field is selected, before the method returns to operation 820. If all the fields have been processed, the method proceeds to operation 860.

At operation 860, the node data engine 230 determines whether all the selected entities have been processed. If not, then the next entity is selected for processing at operation 870 and the method then returns to operation 810. If so, then at operation 880 the visualization engine 240 generates a histogram from the processed data. According to some embodiments, operation 880 is reached only when all completed fields have been processed for the selected entities (the entities which have been selected by a user in the method 300, the method 400 or the method 600).

According to some example embodiments, operation 880 involves counting the number of profiles with the same information in the same field, and forming a list. The list may be ordered according to the count of profiles or by a value of the field. Following operation 880, the histogram is displayed on a display screen of electronic device 110 at operation 890. Operations 810 to 880 may be performed by the network graph parser 234 without the user having requested a histogram, according to some example embodiments. In this case, however, the histogram may be displayed at operation 890 only in response to the option having been selected by the user. In FIG. 9, an example of such a histogram 990 is shown. In this example, the items in the profile description information shared by more than one entity in the graph are A university, B high school, C high school, living in D city, living in E city, working at F company, working at G company and self-employment.
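Operations 810 to 880 may be sketched, by way of example only, as a loop over the populated fields of each selected entity's profile, followed by a count-ordered list. The sketch assumes that each profile is a flat mapping from field name to value held in memory; the function name build_histogram and the treatment of None and empty strings as unpopulated fields are illustrative assumptions.

```python
from typing import Dict, Iterable, List, Tuple

Profile = Dict[str, object]
Row = Tuple[Tuple[str, object], List[str]]


def build_histogram(
    profiles: Dict[str, Profile], selected_ids: Iterable[str]
) -> List[Row]:
    """List (field, value) items shared by more than one profile."""
    rows: Dict[Tuple[str, object], List[str]] = {}
    for selected in selected_ids:                       # operations 860/870
        for name, value in profiles[selected].items():  # operations 810/850
            if value in (None, "") or (name, value) in rows:
                continue  # unpopulated field, or item already recorded
            # Operation 820: search the same field of every stored profile.
            sharers = [
                pid for pid, p in profiles.items() if p.get(name) == value
            ]
            if len(sharers) > 1:
                rows[(name, value)] = sharers           # operation 830
    # Operation 880: order by descending count, then by field and value.
    return sorted(
        rows.items(),
        key=lambda item: (-len(item[1]), item[0][0], str(item[0][1])),
    )
```

Each row keeps the identifiers of the sharing entities, so that the nodes for a chosen row can later be highlighted without a further search.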

Returning to FIG. 9, at operation 893, the interface engine 210 receives a user input selecting one of the items. In some embodiments, the user input may be in the form of the user clicking on a row of the histogram 990. In some embodiments, the user input may be in the form of moving cursors to indicate the desired entry in the histogram. In the example shown in FIG. 9, the user input has been received for ‘Lives in E city,’ which is shared by five entities corresponding to nodes displayed in the graph. At operation 896, in response to this user input, five nodes corresponding to the five entities sharing the profile description information ‘Lives in E city’ are highlighted. The five entities are treated as being participants in the ‘Lives in E city’ group; that is, the users have the attribute of living in E city. In FIG. 9, the highlighted entities are represented by the differently colored nodes 904, 905, 906, 907 and 908.

At any time, any one of the nodes in the graph 900 may be selected by the user using the input device 214 and the cursor control 216. Once selected, the profile 995 of the entity corresponding to the node may be displayed near the graph 900. In FIG. 9, for example, when the entity corresponding to the node 904, which is highlighted because its profile indicates that the entity ‘lives in E city,’ is selected by the user, the profile view 995 may be generated and displayed near the graph 900. The information included in the profile view 995 is present in the volatile memory allocated to the browser 1632 because the profile information was retrieved from the repository server 130 during performance of the method 400, the method 600 or the method 800.

FIG. 10 is a flowchart showing a method 1000 performed by the network graph parser 234 of the electronic device 110 to provide a search facility which can be used to search the profiles of the imported entities, according to some example embodiments. The search facility may be provided in response to a user input selecting a search facility option, for instance through interaction with a user interface element in a sidebar, dock, pull-down menu etc. At operation 1010, the network graph parser 234 may generate a search tool 1150 which can receive a user input for a keyword. In the example of FIG. 11, the keyword ‘E city’ is input into a text entry box provided by the search tool 1150. The keyword ‘E city’ corresponds to a group of users that live in the city called ‘E city’.

FIG. 11 shows an example of a graph 1100 generated by the method 600 and an example of a search tool 1150 generated and operated by the exemplary method 1000. In the left panel of FIG. 11, an example of a graph 1100 generated by the method 600 is shown. In this example, the graph 1100 is similar to the graph 900 in FIG. 9. The graph 1100 is generated from the imported lists of three accessed entities 1101, 1102 and 1103. The search tool 1150 may provide any form of user interface element that can receive the input of the user from the input device 214. For instance, the search tool 1150 may provide a text box into which a user can type alphanumeric characters such as a word or words.

At operation 1020, the node data engine 230 may search the profiles of the imported entities in the generated graph 1100 for entries that match the keyword input in the search tool 1150. This is performed by searching the information in the profiles as stored in the working volatile memory allocated to the browser 1632. At operation 1030, if one or more profiles are found to have the same text as the input text, the method proceeds to operation 1040. Here, the corresponding nodes in the graph 1100 are highlighted via the visualization engine 240. If not, the result of the search is reported at operation 1050. In the example of FIG. 11, five entities 1104, 1105, 1106, 1107 and 1108 are highlighted as a result of the search for the keyword ‘E city.’
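The keyword search of operations 1020 to 1050 may be sketched as follows, assuming again that profiles are held in memory as flat mappings of field name to value. Case-insensitive substring matching is an assumption of this sketch; the disclosure only requires that a profile contain the same text as the input. The function name search_profiles is hypothetical.

```python
from typing import Dict, List

Profile = Dict[str, object]


def search_profiles(profiles: Dict[str, Profile], keyword: str) -> List[str]:
    """Return identifiers of profiles with any field containing the keyword."""
    needle = keyword.casefold()
    return [
        pid
        for pid, profile in profiles.items()
        if any(needle in str(value).casefold() for value in profile.values())
    ]
```

An empty result corresponds to the branch in which the outcome of the search is reported at operation 1050; a non-empty result lists the nodes to be highlighted at operation 1040.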

FIG. 12 is a flowchart showing a method 1200 performed by the network graph parser 234 of the electronic device 110 to filter the data associated with the entities in a plotted graph to produce a reduced graph, according to some example embodiments. FIG. 13 shows examples of graphs 1300 generated by the method 600 and examples of reduced graphs 1310 (e.g., a refined visual representation) and 1320 generated by the exemplary method 1200.

The filter instruction may be provided in response to a user input selecting a filter option, for instance through interaction with a user interface element in a sidebar, dock, pull-down menu etc. If the number of entities displayed in the graph 1300 is large, the graph may be of limited use to an analyst. The filtering method 1200 allows the isolation of the most significant entities and the removal of less significant entities. Such filtering or reducing of data may lead to a more efficient, focused and targeted approach in repository website user analysis. This applies to analysis using the network graph parser 234 and to subsequent analysis after export to the database system 10. Furthermore, trimming the graph before exporting data to the database system 10 may prevent the personal profile data of only marginally relevant or irrelevant individuals from unnecessarily entering the database system 10 for analysis. It may also provide regulation compliance advantages since information relating to fewer entities is imported into the database system 10.

At operation 1210, the interface engine 210 generates a user interface element 1350 configured to receive a user input specifying a connection parameter, such as a minimum number of links that is of interest to the user (e.g., a level of connectedness). Limiting the minimum number of links may assist in selecting the entities with the most meaningful connections in the network represented in the graph 1300. The user interface element 1350 may receive the user input via the input device 1614 or the cursor control 1616.

At operation 1220, the node data engine 230 identifies the entities linked to other entities by the number of connections specified by the user input at operation 1210. All of the connections in FIG. 13 correspond to links to one of the selected entities 1301, 1302 and 1303. Therefore, the number of links of an entity in the example of FIG. 13 only corresponds to the number of connections to the user-selected entities 1301, 1302 and 1303.

In the example of FIG. 13A, the maximum number of links between entities is three. Therefore, the user input may be “2” or both “2” and “3”. The user input of both “2” and “3”, as shown in FIG. 13A, may instruct the node data engine 230 to identify the entities with two and three links to selected entities. The user input of “3”, as shown in FIG. 13B, may cause the network graph parser 234 to identify only the entities with links to three selected entities. Returning to FIG. 12, at operation 1230, the node data engine 230 searches the nodes (e.g., underlying node data in JSON format) corresponding to the identified entities. In FIG. 13A, as a result of the search in this operation, the nodes corresponding to entities having two and three links with the user-selected entities 1301, 1302 and 1303, corresponding to groups of nodes 1304, 1305, 1306 and 1307, have been highlighted by displaying them as empty circles.

In FIG. 13B, the entities having three links with the user-selected entities 1301, 1302 and 1303, corresponding to the group of nodes 1305, have been highlighted as empty circles. Returning to FIG. 12, at operation 1240, the entities that are not identified at operation 1230 and that are not the user-selected entities 1301, 1302 and 1303 may be removed from the graph 1300, according to some example embodiments. This may be achieved by the network graph parser 234 receiving a user input to ‘inverse select’ the other entities that are not highlighted at operation 1230, and then receiving an input to delete the selected nodes/entities, the delete input being received via the input device 1614 or the cursor control 1616. Alternatively, the network graph parser 234 may receive a user input (e.g., a filter instruction) to remove all the entities except the entities highlighted at operation 1230 and the user-selected entities 1301, 1302 and 1303.

FIG. 13A shows an example of a graph 1310 reduced from the graph 1300 according to the method 1200. In the user interface element 1350, “2” and “3” links have been specified by the user and the graph 1310 shows only the user-selected entities 1301, 1302 and 1303 and the entities that are linked to two or three of the accessed entities, namely the groups of nodes 1304, 1305, 1306 and 1307. In FIG. 13B, the graph 1320 shows an example of a graph trimmed from the graph 1300 according to the exemplary method 1200. In the user interface element 1350, “3” links have been specified and the graph 1320 shows only the user-selected entities 1301, 1302 and 1303 and the entities that are linked to all three of the user-selected entities, namely the group of nodes 1305.

If profile description information has been imported along with the entities in the graph 1300, it may be removed along with the entity at operation 1240. After operation 1240, the reduced graphs 1310 or 1320 and/or associated profile description information may be exported to the database system 10 via export engine 250. Though visual graphs are depicted in FIGS. 13A and 13B, it is appreciated that the operations may first be performed on the underlying data used to generate the graphs. That is, the graph 1300 may be generated from initial node data collected from a connections page. A connection parameter may be received from the user that specifies the number of connections required to remain in the node data. Nodes not meeting the attribute specified by the connection parameter are removed. The resulting refined node dataset is then used to generate graph 1310.
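Applied to the underlying node data rather than to the displayed graph, the filtering of operations 1220 to 1240 may be sketched as below. The sketch assumes node data in the form of a mapping from each user-selected entity to the entities it links to, and a connection parameter given as a set of permitted link counts (e.g., {2, 3} or {3}); the function name filter_by_link_count is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Mapping, Set


def filter_by_link_count(
    links: Mapping[str, Iterable[str]], allowed_counts: Set[int]
) -> Dict[str, List[str]]:
    """Keep only linked entities whose link count is in `allowed_counts`."""
    counts: Counter = Counter()
    for targets in links.values():
        for target in set(targets):  # count each selected entity once
            counts[target] += 1
    keep = {target for target, n in counts.items() if n in allowed_counts}
    # The user-selected entities (the keys) always remain in the node data.
    return {
        selected: sorted(t for t in set(targets) if t in keep)
        for selected, targets in links.items()
    }
```

The returned mapping plays the role of the refined node dataset from which a reduced graph may then be generated.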

FIG. 14 is a flowchart representing an exemplary method 1400 performed by an electronic device 110 to export the data associated with the entities in the reduced graphs 1310 and 1320. This may correspond to operation 350 discussed above in relation to FIG. 3. At operation 1410, the network graph parser 234 may receive a user input which instructs the network graph parser 234 to export the reduced graphs 1310 or 1320 and associated data such as profile description information of the entities corresponding to the nodes displayed in the graphs 1310 or 1320.

At operation 1420, the interface engine 210 receives a user input specifying an analysis description. The analysis description may be free text. It may relate to the origin, the history and the description of the data and the details regarding the repository website analysis performed. The analysis description may assist in generating trails so that compliance of the performed analysis with any rules or regulations relevant to the specific field of analysis can be monitored. The analysis description also may be useful in case multiple sets of reduced and processed graphs are generated from different starting accessed entities, for example. If specific entities appear in multiple sets of graphs, the analysis description of each graph may provide additional information and therefore compound the value of multiple investigations.

At operation 1430, the network graph parser 234 may export the data to the database system 10 via export engine 250. Operation 1430 may involve exporting data relating to entities corresponding to nodes displayed in the graph to the database system 10 without exporting data relating to entities corresponding to nodes not displayed in the graph. In the database system 10, the reduced graph and the associated data may be transformed according to the specific ontology of the deployment for further analysis.
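One possible form of the exported data, offered only as a sketch, bundles the free-text analysis description with the profiles of the displayed entities and omits all others. The JSON layout, the key names, and the function name build_export are assumptions; the disclosure does not define the format accepted by the database system 10.

```python
import json
from typing import Dict, Iterable

Profile = Dict[str, object]


def build_export(
    profiles: Dict[str, Profile],
    displayed_ids: Iterable[str],
    analysis_description: str,
) -> str:
    """Serialize only the displayed entities, with the analysis description."""
    payload = {
        "analysis_description": analysis_description,
        "entities": [
            {"id": pid, "profile": profiles[pid]}
            for pid in displayed_ids
            if pid in profiles  # entities without stored data are skipped
        ],
    }
    return json.dumps(payload, sort_keys=True)
```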

Various modifications and alternatives will be apparent to the person skilled in the art and all such modifications and alternatives are intended to be encompassed within the claims. Some such modifications and alternatives will now be described.

Although in the above, the profiles for the user-selected entities are sourced from the same electronic repository website service provider, the scope is not limited to this. In other embodiments, profiles for an entity may be retrieved from two or more different repository servers 130-1 to 130-n. In this case, the entity would ordinarily have different identities or usernames on the different electronic repository websites. However, the profiles can be determined by the network graph parser 234 to be related to the same entity by information included in either profile or in both profiles, or the relationship may be entered into the network graph parser 234 by its user. Alternatively or in addition, two or more different entities from different electronic repository servers 130 may be selected by the user of the network graph parser 234 as seed entities. In this case, information in profiles for linked-to entities may be used to connect profiles in one or more of the repository servers (e.g., repository server 130-1) to corresponding profiles for the same entities in another repository server (e.g., repository server 130-2).

In the above, when an entity is selected for analysis, all of the entities linked to by that profile are retrieved from the electronic repository server 130 and displayed in a graph. Alternatively, a user may specify a limit on the number of entities that are to be retrieved from the electronic repository server 130 by the network graph parser 234 and displayed in a graph. This may be globally set as a setting by the plugin, or it may be selected or entered by the user at the time of selecting the entity. In the above, the histogram is formed from the same information in the same fields or profiles. Alternatively or in addition, the histogram may be formed from information such as geotag information from photos, comments, mentions, replies, and/or suchlike.

FIG. 15A shows an example browser 1500 for parsing node data using the network graph parser 234, according to some example embodiments. In the example of FIG. 15A, an analyst user navigates to the user profile of a user on a code repository website. For example, the analyst user navigates to the URL 1505 (“repository/joan.labrador/”), which is a projects webpage of the software developer “Joan Labrador”. In some example embodiments, the analyst user is a user attempting to identify patterns between software projects and the software developer is a user that uploads the source code to a project webpage of a given software project.

The projects webpage displays the user's uploaded software or project data 1510 as display elements (e.g., boxes, static text, hyperlinks). The title for each of the projects may contain a hyperlink that links to the project page for the corresponding project. For example, in the first listed project, “Smartwatch Exercise App” may be a hyperlink that links to a project page for that project. The project page for “Smartwatch Exercise App” may display source code uploaded by the software developer “Joan Labrador”. The project page may further contain links to the user profile pages of the seventeen developers that work on that project.

The projects webpage is received as HTTP data from the repository server 130. The webpage is generated from underlying source code in a format such as HTML. To initiate parsing, the analyst user selects a plugin button 1515 which, as displayed, is integrated into the browser 1500. Responsive to the selection, the interface engine 210 displays a popup window 1520 having different parse options. According to some example embodiments, the first option “Graph” parses all users associated with the user “Joan Labrador” and creates a visualization from the data as discussed above. The second option “Add to graph” adds Joan Labrador as a second entity. For example, the analyst user may have selected a first user to parse (e.g., collect node data of related developers), and then want to select Joan Labrador as a further entity to parse (e.g., collect node data of developers related to Joan Labrador to add to the graph).

Assuming, to continue the example, the data analyst selects the first option “Graph”, the network graph parser 234 parses the source code that generates the projects webpage to extract node data from Joan's projects as discussed above. For example, the parse engine 220 can identify each of Joan's projects, including (1) “Smartwatch Exercise App”, (2) “Java Note Taking client”, and (3) “Acme Corp. Enterprise CRM System”. The parse engine 220 can navigate to the project page for each of the projects to identify users associated with Joan. For example, the parse engine 220 can use the hyperlink “Smartwatch Exercise App” to navigate to the project page for that project. Further, the parse engine 220 can then identify user profile links on the project page (e.g., the 17 developers working on the “Smartwatch Exercise App” project) and navigate to the user pages to collect node data, such as user name and profile page URL, for each of the associated users. The parse engine 220 may perform similar operations to collect node data for the users associated with the other two code projects. The resulting data can then be used to generate visualizations, as shown in FIG. 15B.
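The extraction of user profile links from the HTML of a project page may be sketched with the HTMLParser class of the Python standard library, as below. The sketch assumes that user profile links can be recognized by a URL prefix; the "/users/" prefix, the class name LinkCollector, and the function name extract_user_nodes are hypothetical, since the disclosure does not specify the markup of any particular code repository website.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class LinkCollector(HTMLParser):
    """Collect the text and URL of anchors whose href starts with a prefix."""

    def __init__(self, prefix: str) -> None:
        super().__init__()
        self.prefix = prefix
        self.links: List[Dict[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith(self.prefix):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            name = "".join(self._text).strip()
            self.links.append({"name": name, "url": self._href})
            self._href = None


def extract_user_nodes(html: str, prefix: str = "/users/") -> List[Dict[str, str]]:
    """Return node data (user name, profile URL) found in a project page."""
    collector = LinkCollector(prefix)
    collector.feed(html)
    collector.close()
    return collector.links
```

Each returned dictionary corresponds to one item of node data; the same routine could be run on each project page in turn to collect the users associated with the remaining projects.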

FIG. 15B displays a user interface 1550 showing a visualization 1555 generated from the node data of users associated with Joan Labrador through one or more coding or software projects. Each circle or node corresponds to a user associated with Joan through a project. The user interface 1550 may open in a second tab of the browser 1500. As illustrated, the user interface 1550 includes a main area in which the visualization is displayed, and a right bar area 1570. For example, selecting one of the buttons may display the user interface element 1350 (FIG. 13B) which the analyst user can use to specify a connection parameter. Further, as illustrated in the example of FIG. 15B, the right bar area 1570 can be used to show parsed node data 1557 of the selected entity “Joan Labrador.” The parsed node data 1557 may be parsed or extracted from the underlying source code of the webpage displayed in FIG. 15A (e.g. a user profile page). According to some example embodiments, if a user selects a node from the visualization 1555, the corresponding node data for the node is shown in the right bar area 1570.

Further, according to some example embodiments, the right bar area 1570 may be used to show other types of visualizations, such as the histogram 990, instead of the node data. The analyst can then use the histogram to select groups to modify the visualization 1555. In some example embodiments, the network graph parser 234 spiders to one or more hyperlinks for each user listed in a project page to collect parsed node data similar to Joan's parsed node data 1557.

FIG. 16 is a block diagram that illustrates a computer system 1600, which may constitute the electronic device 110, according to some example embodiments. As illustrated, computer system 1600 includes a bus 1602 or other communication mechanism for communicating information, and one or more hardware processors 1604 (including processor circuitry), coupled with bus 1602 for processing information. One or more hardware processors 1604 can be, for example, one or more general purpose microprocessors, each including processor circuitry. Computer system 1600 also includes a main memory 1606, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 1602 for storing information and instructions to be executed by processor 1604.

Main memory 1606 also can be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1604. Such instructions, when stored in non-transitory storage media accessible to one or more processors 1604, render computer system 1600 into a special-purpose machine that is customized to perform the operations specified in the instructions. Main memory 1606 may also be used for temporarily storing the whole or part of applications, such as the web browser 1632, including the network graph parser 234, while they are being executed by the electronic device 110. As illustrated in FIG. 2, the network graph parser 234 may be integrated or installed into the web browser 1632. For example, the network graph parser 234 may be installed as a plugin or extension of the web browser 1632.

The main memory 1606 is a volatile memory in that data stored therein is lost when power is no longer provided to the memory 1606. The main memory 1606 is used to temporarily store information that is being processed by software applications, including the web browser 1632 and the network graph parser 234. In relation to the web browser 1632 and the network graph parser 234, information that is temporarily stored includes webpages and ancillary content that is received from the repository servers 130-1 to 130-n. In relation to the web browser 1632 and the network graph parser 234, information that is temporarily stored also includes information parsed from webpages by the network graph parser 234 and information derived from such received information by the plugin, as is described in detail below.

Computer system 1600 further includes a read only memory (ROM) 1608 or other static storage device coupled to bus 1602 for storing static information and instructions for processor 1604. The ROM 1608 is used for permanent storage of applications such as the web browser 1632, including the network graph parser 234, when the electronic device is not powered on and/or when the applications are not being executed by the processor 1604. The storage is of the computer code or instructions that constitute the applications. A storage device 1610, such as a magnetic disk, optical disk, or USB thumb drive (Flash drive), etc., is provided and coupled to bus 1602 for storing information and instructions.

Computer system 1600 can be coupled via bus 1602 to a display 1612, such as an LCD or plasma display, or a touchscreen or cathode ray tube (CRT), for displaying information to a computer user. An input device 1614, for instance a keyboard, including alphanumeric and other keys, is coupled to bus 1602 for communicating information and command selections to processor 1604. Another type of user input device is cursor control 1616, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 1604 and for controlling cursor movement on display 1612. In some embodiments, the same direction information and command selections as cursor control may be implemented via receiving touches on a touchscreen without a cursor. It will be appreciated that the processor 1604, under control of software and/or operating system, causes display of graphics and text, and that the display 1612 displays such. Displaying a graph comprises displaying a graphical representation.

The term “non-transitory media” as used herein refers to any media storing data and/or instructions that cause a machine to operate in a specific fashion. Such non-transitory media can comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 1610. Volatile media includes dynamic memory, such as main memory 1606. Common forms of non-transitory media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge, and networked versions of the same.

Non-transitory media is distinct from, but can be used in conjunction with, transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 1602. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications. Various forms of media can be involved in carrying one or more sequences of one or more instructions to processor 1604 for execution. For example, the instructions can initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 1600 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 1602. Bus 1602 carries the data to main memory 1606, from which processor 1604 retrieves and executes the instructions. The instructions received by main memory 1606 can optionally be stored on storage device 1610 either before or after execution by processor 1604.

Computer system 1600 also includes a communication interface 1618 coupled to bus 1602. Communication interface 1618 provides a two-way data communication coupling to a network link 1621 that is connected to a local network 1622. For example, communication interface 1618 can be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 1618 can be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links can also be implemented. In any such implementation, communication interface 1618 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 1621 typically provides data communication through one or more networks to other data devices. For example, network link 1621 can provide a connection through local network 1622 to data equipment operated by an Internet Service Provider (ISP) 1626. ISP 1626 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 1628. Local network 1622 and Internet 1628 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 1621 and through communication interface 1618, which carry the digital data to and from computer system 1600, are example forms of transmission media.

Computer system 1600 can send messages and receive data, including program code, through the network(s), network link 1621 and communication interface 1618. In the Internet example, a server 1627 might transmit a requested code for an application program through Internet 1628, ISP 1626, local network 1622 and communication interface 1618. The received code can be executed by processor 1604 as it is received, and/or stored in storage device 1610, or other non-volatile storage for later execution.

The network graph parser 234 is integrated into the web browser 1632 to form part of the web browser 1632. The user can first download the network graph parser 234 from an appropriate web site or other source (e.g., portable storage such as a thumb drive or a storage device on a local network) and then can proceed to install the network graph parser 234. Since a typical network graph parser 234 is designed to be compatible with a specific web browser 1632 (e.g., Google™ Chrome™, Mozilla™ Firefox™, Microsoft™ Internet Explorer™, etc.), the network graph parser 234 can become a part of the web browser 1632 automatically after the network graph parser 234 is installed.

Above, various actions are described as being performed by the network graph parser 234 and/or the web browser 1632. It will be appreciated that this is shorthand for computer program instructions that form part of the network graph parser 234 or the browser 1632, as the case may be, being executed by the processor 1604 and causing the processor 1604 to take the action. In doing so, some or all of the computer code/instructions constituting the network graph parser 234 and the browser 1632 are copied from the ROM 1608 and stored in the main memory 1606, which is a volatile memory, such that the computer code/instructions constituting the network graph parser 234 and the browser 1632 can be executed by the processor 1604. In executing the computer code/instructions constituting the network graph parser 234 and the browser 1632, the processor 1604 is controlled to store data (other than the computer code/instructions constituting the network graph parser 234 and the browser 1632) temporarily in the main memory 1606. As mentioned above, the main memory 1606 is volatile memory and as such data stored therein is lost when the main memory 1606 is de-powered.

Certain embodiments are described herein as including logic or a number of components, modules, or engines. Engines can constitute either software engines (e.g., code embodied on a machine-readable medium) or hardware engines. A “hardware module” is a tangible unit capable of performing certain operations and can be configured or arranged in a certain physical manner. In various example embodiments, one or more computer systems (e.g., a standalone computer system, a client computer system, or a server computer system) or one or more hardware engines of a computer system (e.g., a processor or a group of processors) can be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.

In some embodiments, a hardware engine can be implemented mechanically, electronically, or any suitable combination thereof. For example, a hardware engine can include dedicated circuitry or logic that is permanently configured to perform certain operations. For example, a hardware engine can be a special-purpose processor, such as a Field-Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC). A hardware engine may also include programmable logic or circuitry that is temporarily configured by software to perform certain operations. For example, a hardware engine can include software executed by a general-purpose processor or other programmable processor. Once configured by such software, hardware modules become specific machines (or specific components of a machine) uniquely tailored to perform the configured functions and are no longer general-purpose processors. It will be appreciated that the decision to implement a hardware engine mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) can be driven by cost and time considerations.

Accordingly, the phrase “hardware engine” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. As used herein, “hardware-implemented engine” refers to a hardware module. Considering embodiments in which hardware engines are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where a hardware module comprises a general-purpose processor configured by software to become a special-purpose processor, the general-purpose processor may be configured as respectively different special-purpose processors (e.g., comprising different hardware modules) at different times. Software accordingly configures a particular processor or processors, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.

Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules can be regarded as being communicatively coupled. Where multiple hardware modules exist contemporaneously, communications can be achieved through signal transmission (e.g., over appropriate circuits and buses) between or among two or more of the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module can perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module can then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules can also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).

The various operations of example methods described herein can be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors constitute processor-implemented modules that operate to perform one or more operations or functions described herein. As used herein, “processor-implemented module” refers to a hardware module implemented using one or more processors.

Similarly, the methods described herein can be at least partially processor-implemented, with a particular processor or processors being an example of hardware. For example, at least some of the operations of a method can be performed by one or more processors or processor-implemented modules. Moreover, the one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), with these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., an Application Program Interface (API)).

The performance of certain of the operations may be distributed among the processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processors or processor-implemented modules can be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the processors or processor-implemented modules are distributed across a number of geographic locations.

The modules, methods, applications and so forth described in conjunction with FIGS. 1-15 are implemented in some embodiments in the context of a machine and an associated software architecture. The sections below describe representative software architecture and machine (e.g., hardware) architecture that are suitable for use with the disclosed embodiments.

Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and functionality presented as separate components in example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein.

Although an overview of the inventive subject matter has been described with reference to specific example embodiments, various modifications and changes may be made to these embodiments without departing from the broader scope of embodiments of the present disclosure. Such embodiments of the inventive subject matter may be referred to herein, individually or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any single disclosure or inventive concept if more than one is, in fact, disclosed.

The embodiments illustrated herein are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed. Other embodiments may be used and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. The Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various embodiments is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.

As used herein, the term “or” may be construed in either an inclusive or exclusive sense. Moreover, plural instances may be provided for resources, operations, or structures described herein as a single instance. Additionally, boundaries between various resources, operations, modules, engines, and data stores are somewhat arbitrary, and particular operations are illustrated in a context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within a scope of various embodiments of the present disclosure. In general, structures and functionality presented as separate resources in the example configurations may be implemented as a combined structure or resource. Similarly, structures and functionality presented as a single resource may be implemented as separate resources. These and other variations, modifications, additions, and improvements fall within a scope of embodiments of the present disclosure as represented by the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

1. (canceled)
2. A method comprising: receiving, by a client device, from a network site, node connection data of an initial user object associated with the network site, the node connection data being included in a page of the network site; identifying, by the client device, additional user objects included in the node connection data of the initial user object; receiving, from a user of the client device, selection of a connection parameter shared by the initial user object and a portion of the additional user objects on the network site; receiving, from the user of the client device, an inversion instruction to remove non-selected portions that are not in the specified portion of the selected additional user objects; and generating, by the client device, a visual representation that depicts connections between the initial user object and the portion of the additional user objects.
3. The method of claim 2, wherein the connection parameter is a quantity of connections to other user objects, and the connection parameter specifies that for inclusion in the portion each user object has more than a specified quantity of connections to other user objects.
4. The method of claim 2, wherein the connection parameter is participation in an object group on the network site, and the connection parameter specifies that for inclusion in the portion each user object is a participant in a selected group on the network site.
5. The method of claim 2, wherein the visual representation is a network graph having nodes connected by edges, the nodes corresponding to the initial user object and the additional user objects, the edges corresponding to connections among the initial user object and the additional user objects.
6. The method of claim 2, further comprising: receiving, from the user of the client device, one or more search terms to search in the additional user objects; identifying one or more additional user objects that match the one or more search terms; and storing the one or more additional user objects as the portion specified by the selection instruction.
7. The method of claim 2, wherein the node connection data is user data and the initial user object is the user of the network site, and wherein the additional node connection data is additional user data and the additional user objects are other users that are connected to the user on the network site.
8. The method of claim 2, wherein each of the node connection data and the additional node connection data comprise at least one of the following: a username of a given user on the network site, a uniform resource locator (URL) of a profile page of the given user on the network site, images uploaded by the given user to the network site, text uploaded by the given user to the network site.
9. The method of claim 2, further comprising: displaying, on a display device of the client device, the page comprising the node connection data from the network site.
10. A client device comprising: one or more processors; one or more input devices; a display device; and a memory comprising instructions that, when executed by the one or more processors, cause the client device to perform operations comprising: receiving, from a network site, node connection data of an initial user object associated with the network site, the node connection data being included in a page of the network site; identifying additional user objects included in the node connection data of the initial user object; receiving, from the one or more input devices, selection of a connection parameter shared by the initial user object and a portion of the additional user objects on the network site; receiving, from the one or more input devices, an inversion instruction to remove non-selected portions that are not in the specified portion of the selected additional user objects; and generating a visual representation that depicts connections between the initial user object and the portion of the additional user objects; and displaying the visual representation on the display device.
11. The client device of claim 10, wherein the connection parameter is a quantity of connections to other user objects, and the connection parameter specifies that for inclusion in the portion each user object has more than a specified quantity of connections to other user objects.
12. The client device of claim 10, wherein the connection parameter is participation in an object group on the network site, and the connection parameter specifies that for inclusion in the portion each user object is a participant in a selected group on the network site.
13. The client device of claim 10, wherein the visual representation is a network graph having nodes connected by edges, the nodes corresponding to the initial user object and the additional user objects, the edges corresponding to connections among the initial user object and the additional user objects.
14. The client device of claim 10, the operations further comprising: receiving, from the user of the client device, one or more search terms to search in the additional user objects; identifying one or more additional user objects that match the one or more search terms; and storing the one or more additional user objects as the portion specified by the selection instruction.
15. The client device of claim 10, wherein the node connection data is user data and the initial user object is the user of the network site, and wherein the additional node connection data is additional user data and the additional user objects are other users that are connected to the user on the network site.
16. The client device of claim 10, wherein each of the node connection data and the additional node connection data comprise at least one of the following: a username of a given user on the network site, a uniform resource locator (URL) of a profile page of the given user on the network site, images uploaded by the given user to the network site, text uploaded by the given user to the network site.
17. The client device of claim 10, the operations further comprising: displaying, on a display device of the client device, the page comprising the node connection data from the network site.
18. A non-transitory computer readable storage medium comprising instructions that, when executed by one or more processors of a device, cause the device to perform operations comprising: receiving, from a network site, node connection data of an initial user object associated with the network site, the node connection data being included in a page of the network site; identifying additional user objects included in the node connection data of the initial user object; receiving selection of a connection parameter shared by the initial user object and a portion of the additional user objects on the network site; receiving an inversion instruction to remove non-selected portions that are not in the specified portion of the selected additional user objects; and generating a visual representation that depicts connections between the initial user object and the portion of the additional user objects.
19. The non-transitory computer readable storage medium of claim 18, wherein the connection parameter is a quantity of connections to other user objects, and the connection parameter specifies that for inclusion in the portion each user object has more than a specified quantity of connections to other user objects.
20. The non-transitory computer readable storage medium of claim 18, wherein the connection parameter is participation in an object group on the network site, and the connection parameter specifies that for inclusion in the portion each user object is a participant in a selected group on the network site.
21. The non-transitory computer readable storage medium of claim 18, wherein the visual representation is a network graph having nodes connected by edges, the nodes corresponding to the initial user object and the additional user objects, the edges corresponding to connections among the initial user object and the additional user objects.